** Data reading and variable selection from raw data
** UK National Heights and Weights Survey 1980

** 01. Reading data **

cap log close
clear all
cd /*insert your working directory here*/
use /*insert the path to your raw data file here*/
set more off
numlabel, add


** 02. Constructing year and country variables **

ge year=1980
lab var year "survey year"

ge country=826
lab var country "ISO country code"
// ISO 3166-1 numeric code for the UK: 826


** 03. ID variables **

ge pid=v1
lab var pid "person id"

ge hid=v3*1000+v4*10+v5
lab var hid "household id"
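
* Optional sanity checks on the IDs (a sketch, not part of the original workflow).
* Assumption: pid is meant to identify each respondent uniquely; drop the check if not.
capture noisily isid pid
* ge stores hid as float by default, which can round large integers;
* this flags any case where the stored value differs from the formula.
capture noisily assert hid == v3*1000 + v4*10 + v5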


** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=v8
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge birthyr=.
replace birthyr=1900+v11 if inrange(v11,0,64)  /* v11 = last two digits of birth year */
replace birthyr=1800+v11 if inrange(v11,68,99)  /* 68-99 treated as 1868-1899; 65-67 stay missing */
lab var birthyr "year of birth"

ge age=v12
lab var age "age"  /* age last birthday */
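
* Optional consistency check (a sketch, not part of the original workflow):
* with the survey year fixed at 1980, year - birthyr - age should be 0 or 1
* if age last birthday and birth year agree. chk_agegap is a hypothetical helper name.
ge chk_agegap = year - birthyr - age
tab chk_agegap, m
drop chk_agegap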


** 05. Siblings **

ge nsibs=v199  /* number of siblings including R, recorded for R who reported having siblings */
recode nsibs (-2/-1=.)  /* recode -2 & -1 to missing */
replace nsibs=nsibs-1  /* subtract 1 to exclude R */
lab var nsibs "number of siblings"

* recode nsibs to 0 for R who reported no siblings (v198==1)
replace nsibs=0 if v198==1

* birth order
ge birthorder=v200
recode birthorder (-2/-1=.)
lab var birthorder "birth order"
* recode birthorder to 1 for R who reported no siblings (v198==1)
replace birthorder=1 if v198==1
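
* Optional plausibility check (a sketch, not part of the original workflow):
* once R is excluded from nsibs, birth order should not exceed nsibs + 1.
* Assumption: v199 and v200 refer to the same set of siblings.
count if birthorder > nsibs + 1 & !missing(birthorder, nsibs)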


** 06. Own education **

// No schooling information

** 07. Parents' education: Father and/or Mother **

// No parents' education available

** 08. Own occupation **

ge class=v144  /* by current job; if retired, by primary job before retirement */
lab def class 1 "professional occ" 2 "intermediate occ" 3 "skilled non-manual" ///
			  4 "skilled manual" 5 "partly skilled occ" 6 "unskilled occ" ///
			  7 "n/a" 
lab val class class
lab var class "R's own social class"

ge class2=v145  /* by current job; if retired, by primary job before retirement */
#delimit ;
lab def class2
1 "employers, managers - large firms"
2 "employers, managers - small firms"
3 "professional - self-employed"
4 "professional - employees"
5 "intermediate non-manual workers"
6 "junior non-manual workers"
7 "personal service workers"
8 "foremen and supervisors - manual"
9 "skilled manual workers"
10 "semi-skilled manual workers"
11 "unskilled manual workers"
12 "own account workers (other than professional)"
13 "farmers - employers, managers"
14 "farmers - own account"
15 "agricultural workers"
16 "members of armed forces"
17 "inadequately described";
#delimit cr
lab val class2 class2
lab var class2 "R's own social class (detailed)"

** 09. Parents' occupation **

ge faclass=v148
lab val faclass class
lab var faclass "father's social class"


* recode -2 & -1 to missing in all occupation variables
recode class class2 faclass (-2/-1=.)


** 10. Tabulate the Identified Variables **

numlabel, add
log using /*insert the path to your log file here*/, replace text

** Data reading and variable selection from raw data
** UK National Heights and Weights Survey 1980


** Sex **
tab sex,m

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs birthorder, d


** R's Own Occupation **
tab1 class class2 ,m

** Parental Occupation **
tab1 faclass,m

log close


** 11. Keep the identified variables only **

keep year country pid hid ///
	 sex age birthyr ///
	 nsibs birthorder ///
	 class class2 faclass

** 12. Save the Data File **

saveold /*insert the path to your output data file here*/, replace
